Neural Networks and the Time-Sliced Paradigm for Speech Recognition

Authors

  • Ingrid Kirschning
  • Jun-Ichi Aoe
Abstract

The Time-Slicing paradigm is a newly developed method for training neural networks for speech recognition. The neural network is trained to spot the syllables in a continuous stream of speech and generates a transcription of the utterance, whether it is a word, a phrase, or a longer unit. Combined with a simple error recovery method, the desired units (words or phrases) can then be retrieved. The paradigm uses a recurrent neural network trained in a modular fashion with natural connectionist glue. The network processes the input signal sequentially, regardless of the input's length, and immediately outputs the syllables spotted in the speech stream. In the example presented, the resulting character string is compared to a set of possible words, and the five closest candidates are picked out. In this paper we describe the Time-Slicing paradigm and the training of the recurrent neural network, together with details about the training samples. We also introduce the concept of natural connectionist glue and the architecture of the recurrent neural network used for this purpose. Additionally, we explain the errors found in the output and the process used to reduce them and recover the correct words. The recognition rates of the network and the recovery rates for the words are also shown. The presented examples and recognition rates demonstrate the potential of the Time-Slicing method for continuous speech recognition.
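The candidate-selection step mentioned in the abstract (comparing the transcribed character string to a set of possible words and keeping the five closest) can be sketched as follows. This is only an illustrative sketch: the abstract does not say which similarity measure the authors use, so Levenshtein edit distance is an assumption here, and the function names `edit_distance` and `closest_candidates` as well as the sample vocabulary are hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (assumed similarity measure)."""
    # prev[j] holds the distance between the first i-1 chars of a and first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        prev = curr
    return prev[-1]


def closest_candidates(transcription: str,
                       vocabulary: List[str],
                       k: int = 5) -> List[Tuple[str, int]]:
    """Return the k vocabulary words closest to the network's transcription."""
    scored = [(edit_distance(transcription, word), word) for word in vocabulary]
    scored.sort()  # by distance first, then alphabetically to break ties
    return [(word, dist) for dist, word in scored[:k]]


if __name__ == "__main__":
    # Hypothetical vocabulary and a transcription containing one wrong character.
    vocabulary = ["sakura", "sakana", "sakaba", "tanuki", "takara", "kasa", "sekai"]
    for word, dist in closest_candidates("sakora", vocabulary):
        print(word, dist)
```

Sorting the `(distance, word)` pairs keeps the ranking deterministic when several words are equally close; a real system might instead weight substitutions by how often the network confuses particular syllables.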

Similar resources

Recent Advances in Continuous Speech Recognition Using the Time-Sliced Paradigm

We developed a method called Time-Slicing [1] for the analysis of the speech signal. It enables a neural network to recognize connected speech as it comes, without having to fit the input signal into a fixed time format or to label or segment it phoneme by phoneme. The neural network produces an immediate hypothesis of the recognized phoneme, and its size is small enough to run even on a PC. To i...

The Time-Sliced Paradigm - A Connectionist Method for Continuous Speech Recognition

In this paper a new method, called the Time-Slicing Paradigm, is presented for the recognition of temporal patterns using neural networks. It is a method for analyzing the speech signal that aims to recognize connected speech with less preprocessing of the input signal than other existing neural networks require. Along with the Time-Slicing Paradigm, this work also introduces...

Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy continues to improve, a considerable gap remains between their accuracy and human recognition ability. This is partly due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

Publication date: 2002